Analysis of Titanic Passengers Data¶


The RMS Titanic, as everyone probably knows (but I'll remind you just in case), sank at 2:20 AM on the night of April 14th to 15th, 1912, after colliding with an iceberg in the Atlantic during its maiden voyage from Europe to New York. Of the 2,208 people on board, only 712 survived the sinking, 6 of whom died before the rescue ships carrying the survivors reached New York.

titanic i gora.jpg

The route of the ship¶

Initially, the Titanic’s voyage proceeded without incident. The ship departed on time from the port of Southampton in the United Kingdom, heading for Cherbourg in France. Because that port was too shallow to accommodate the Titanic, passengers and cargo were brought on board with the help of the tender (a small auxiliary ship) S.S. "Nomadic" (which, incidentally, survives to this day: after decades moored on the Seine in Paris, it is now preserved in Belfast as a museum ship, retaining almost its original appearance). Next, the Titanic sailed to Queenstown in Ireland, from where, after taking on the remaining passengers, it set out on the morning of April 11th on its voyage across the Atlantic to New York.

TitanicRoute.jpg

Passengers¶

A standard III class ticket cost a month's salary for a skilled worker. A II class ticket – a month's salary for a teacher or clerk. And a I class ticket – a month's salary for a doctor or engineer. Of course, we are talking here about monthly salaries in the United States; in Europe, wages were several times lower. In I class, apart from the standard cabins, there were also several suites with prices that could make your head spin. These were mainly used by millionaires traveling with their servants, butlers, and maids. Naturally, for each member of staff, they had to pay the fare of a regular I class ticket.

For this reason, I class was mainly occupied by very wealthy people and those from so-called “society”. II class was taken by small industrialists, clergymen, lawyers, journalists, teachers, and tourists from Europe. In III class, practically only economic migrants traveled. Most of them were Irish, English, and Swedes, but there were also 50 Bulgarians, 37 Croats, 70 Lebanese, and 80 Syrians, most of whom were Christians fleeing religious persecution in the Ottoman Empire.

Throughout the voyage, III class passengers were separated and confined to their section of the ship (although they had a dining room, and their own promenade deck where they held dances and social gatherings). This was not the whim of the ship owner, but a requirement of the American Immigration Office, because after arriving in the USA, I and II class passengers disembarked normally, while III class passengers were transported to Ellis Island, where they were photographed, fingerprinted, and checked to see if they appeared in criminal records or psychiatric hospital files. Officials also checked whether they had tuberculosis, trachoma (an infectious eye disease still found today in poor countries), or whether they were polygamists or anarchists. Only after such screening could they begin their new, beautiful lives in the USA.

Passenger arrangement on the ship¶

titanic-przekroj.jpg

The decks of the ship were marked with letters of the alphabet, counting down from the topmost boat deck, which mainly served as a promenade deck and, in case of danger, also as an evacuation deck. Below it was A deck, where the first-class lounges, library, and smoking room were located. Decks B and C housed the first- and second-class cabins, and D deck held the first- and second-class dining rooms. Decks E, F, and G were intended mainly for second- and third-class cabins and crew quarters. Lower down, partly below the waterline, were the cargo holds, boiler rooms, and engine room.

On the diagram above, I class areas are marked in yellow, II class areas in green, and III class areas in brown. As you can see, III class passengers were split between the two ends of the ship. Single men occupied the bow section. Families, and mothers traveling alone with children to join husbands who had emigrated to the USA earlier and could now pay for their families’ passage, were accommodated at the stern.

The Accident¶

On that fateful day, April 14th, filet mignon (the most tender steak cut from the center of the tenderloin) with goose liver and artichokes, topped with truffle sauce, was served at first-class dinner. And to wash it down—champagne and rum sorbet. In third class, on the other hand, there were crackers with cheese and ham. That was still pretty good, since on most ships, third-class passengers had to bring their own food for the entire journey.

After dinner, when the passengers had already gone to sleep, at 11:40 PM an iceberg was spotted directly in front of the ship. The ship made a sharp turn to the left and half a minute later brushed against it with its starboard side. The advancing mass of ice bent the steel hull plates over a length of about 90 meters; the strain broke dozens of rivets holding the plates together, and water began to pour into the ship through small gaps (small, because their total size only slightly exceeded one square meter) inside the hull. The problem, however, was that water was flooding six of the forward compartments at once, while the ship could stay afloat with a maximum of four compartments flooded.

Cross_section_of_the_Titanic.jpg

Evacuation of passengers¶

After stopping the engines and a 40-minute inspection of the damage conducted by the captain together with the ship's carpenter and the main designer of the ship, engineer Thomas Andrews, it was determined that the cracks in the hull could not be sealed, and the pumps would not be able to handle the water entering the vessel. It was calculated that the ship would sink within a maximum of two hours. The captain therefore ordered an evacuation. An alarm was sounded, passengers were awakened, everyone was given life jackets, and the lowering of lifeboats began. To calm the nervous passengers, the orchestra started playing. The first lifeboat was lowered at 00:40 (one hour after the collision). The next boats were lowered at five-minute intervals.

Number of lifeboats¶

After the disaster, accusations were made about the insufficient number of lifeboats. In fact, the Titanic carried the following lifesaving equipment: two small boats, numbered 1 and 2, normally used for communication with the shore and for carrying a small number of people or cargo, each of which could hold a maximum of 40 people; 14 large rowboats, numbered 3 to 16, each of which could accommodate 65 people; and 4 cork rafts with raised canvas sides, each capable of holding a maximum of 47 people. The regulations at the time set the required number of boats not by the number of passengers, but by the ship’s tonnage (displacement). And the Titanic exceeded these standards.
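As a quick sanity check, the total lifeboat capacity can be tallied and compared with the number of people on board. This is only a sketch of the arithmetic: the variable names and dictionary labels are mine, while the counts and capacities are the ones quoted in the text.

```python
# Lifeboat inventory as quoted in the text: (number of boats, seats per boat).
# The dictionary keys are illustrative labels, not official designations.
lifeboats = {
    "small boats": (2, 40),
    "large rowboats": (14, 65),
    "cork rafts": (4, 47),
}

people_on_board = 2208  # figure quoted in the introduction

total_boats = sum(count for count, _ in lifeboats.values())
total_seats = sum(count * seats for count, seats in lifeboats.values())
coverage = total_seats / people_on_board

print(f"Boats: {total_boats}, seats: {total_seats}")
print(f"Seats available for {coverage:.0%} of the people on board")
```

Under these figures there are 1178 seats in 20 boats, enough for only a little over half of the people on board even if every boat had been filled completely.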

What’s more, its designers considered the rapid sinking of such a large ship to be practically impossible and designed the lifesaving equipment for much more probable incidents, such as a machinery failure on the open sea or a collision with another, smaller ship. In either case, if continuing the voyage was impossible, the boats were only intended to transfer passengers between the Titanic and vessels sent by the White Star Line, which would arrive at the site within a few hours, or a dozen or so at most. In such circumstances, the number of lifeboats on the Titanic was more than sufficient. However, in our specific case, the collision with the iceberg, the number of lifeboats proved to be tragically inadequate.

Number of people in the lifeboats¶

The fact that the boats lowered into the water were not fully filled with people was due to the regulations of the time. Namely, the boats could not be fully loaded during lowering, so as not to break the davits from which they were suspended. The remaining passengers were supposed to be taken on board through the opened side doors once the boat was on the water. The sense of these regulations was demonstrated by an incident during the lowering of one of the last boats, which was already filled to capacity. Boat number 13, with 65 people on board (according to the "Encyclopedia Titanica," or 55 according to Wikipedia), damaged the davits during lowering to the extent that the ropes running through them jammed, and the boat hung almost a meter above the water. And since boat number 15, with 68 people, was being lowered right next to it, the two boats narrowly avoided colliding and crushing the passengers. To prevent this, at the last moment, sailors cut the ropes holding up the first boat, which dropped into the water with force. The commissions (both English and American) that later investigated the accident did not consider the actions of the officers overseeing the lowering of the boats to be a mistake. In the photograph below, you can see what the process of passengers passing through the side doors into the boats looked like. The fact that on the Titanic these doors were ultimately not opened, and that the lifeboats on the water were not filled to capacity, is a different story.

ladowanie pasazerow wg instrukcji.jpg

The fact that the boats, especially those launched first, were not filled to a greater extent was also due to the fact that passengers were not very eager to leave the calm and brightly lit deck of the largest and most luxurious ship in the world, only to transfer in the middle of the night to a small, wooden dinghy, swaying eight stories above the icy waves of the Atlantic. All the more so since there was no apparent real danger. That is why some ladies actually had to be forcibly pushed into the first lifeboats. Of course, the situation changed dramatically after several dozen minutes, when it became clear that the ship was sinking, but by then most lifeboats were already on the water.

A graphic showing the successive stages of the ship sinking¶

Titanic_sinking_gif.gif

As you can see, it was only after an hour had passed since the collision that one could notice the first disturbing signs that the ship was sinking. After that, things progressed really quickly.


About the Data:¶

The dataset contains information about the passengers of the RMS Titanic. The data includes such attributes as travel class, age, sex, number of siblings/spouses aboard, number of parents/children aboard, ticket fare, and the port of embarkation. The dataset also contains information about whether a given passenger survived the disaster.

Columns:

  • pclass - Ticket class
  • survived - Whether the passenger survived the disaster
  • name - Passenger's name and surname
  • sex - Passenger's gender
  • age - Passenger's age
  • sibsp - Number of siblings/spouses aboard
  • parch - Number of parents/children aboard
  • ticket - Ticket number
  • fare - Ticket fare
  • cabin - Cabin number
  • embarked - Port where the passenger boarded (C = Cherbourg, Q = Queenstown, S = Southampton)
  • boat - Lifeboat number
  • body - Body number (if the passenger did not survive and the body was found)
  • home.dest - Destination location
In [14]:
# imports

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import matplotlib.patches as mpatches
from matplotlib.patches import Patch
import numpy as np
from scipy.interpolate import make_interp_spline
In [15]:
# load the data

df = pd.read_csv('26__titanic.csv', sep=",")
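
Before going further, it can be worth confirming that a loaded file really has the columns from the list above. A minimal sketch, assuming those column names; the helper `missing_columns` and the one-row `sample` frame are made up for illustration and only stand in for `df` so the snippet runs on its own.

```python
import pandas as pd

# Column names taken from the column list in the data description.
EXPECTED_COLUMNS = [
    "pclass", "survived", "name", "sex", "age", "sibsp", "parch",
    "ticket", "fare", "cabin", "embarked", "boat", "body", "home.dest",
]

def missing_columns(frame: pd.DataFrame) -> list:
    """Return the expected columns that are absent from the frame."""
    return [col for col in EXPECTED_COLUMNS if col not in frame.columns]

# Hypothetical stand-in for df: a single row, deliberately lacking 'home.dest'.
sample = pd.DataFrame(
    [[3.0, 1.0, 'Healy, Miss. Hanora "Nora"', "female", None, 0.0, 0.0,
      "370375", 7.75, None, "Q", "16", None]],
    columns=EXPECTED_COLUMNS[:-1],
)

print(missing_columns(sample))
```

On the real data, `missing_columns(df)` returning an empty list would mean every documented column is present.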

Questions and hypotheses:¶

Looking at the dataset, several questions immediately come to mind:

  1. What impact did age have on the chances of survival? Maybe younger and more agile people had a better chance of getting into the lifeboats than older, frailer people?
  2. Did gender matter? Maybe men chivalrously let women go first?
  3. Did belonging to a certain passenger class matter? And if so, in what way?
  4. Did having a spouse or siblings on board increase the chances of survival? (Perhaps a brother helped his sister, or a husband helped his wife?)
  5. Or maybe having children or parents on board decreased the chances of survival? (Because you needed to take extra care of them and watch over them?)
  6. Is there any connection between the lifeboat number and the class you traveled in, or gender, or maybe age? And if so, what is it?
  7. Did waiting to board a lifeboat on a particular side of the ship (right or left) have any impact on the chances of survival?

I will try to find answers to these questions during further analysis.


1. Conclusions from the analysis of basic information about the data:¶

Bingo! We have 1309 records, which means we’re only missing data on 8 passengers (that’s much less than 1%). The second, even better piece of information is that we have a complete set of the data that interests us most, namely about the passengers who survived the disaster. As we know, 712 people survived the disaster in total, including 212 crew members, and we have data on 500 surviving passengers. So, in this regard, our data is complete and we can work with it. Therefore, we will be able to answer all the questions and hypotheses posed above.

It’s true that for 15% of survivors their age is unknown, and for 5% we don’t know the number of the lifeboat they got into, but these gaps are not large enough to significantly affect the quality of our analysis. The situation is much worse when it comes to the cabin occupied during the voyage. Here, missing data amounts to over 60% for survivors and almost 90% for those who died. Fortunately, we won’t actually need this data for our further investigations.
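
The percentages above come from dividing the missing-value counts by the group sizes (for example 73/500 ≈ 14.6% for the age of survivors, and 707/809 ≈ 87.4% for the cabin of those who died). One way to get all of these shares at once is sketched below; the `toy` frame is invented for illustration, and only its column names match the real dataset.

```python
import numpy as np
import pandas as pd

# Made-up miniature dataset; only the column names match the real one.
toy = pd.DataFrame({
    "survived": [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    "age":      [29.0, np.nan, 2.0, 40.0, np.nan, np.nan, 30.0, 25.0],
    "cabin":    ["B5", np.nan, np.nan, "C22", np.nan, np.nan, np.nan, "D46"],
    "boat":     ["2", "11", np.nan, "3", np.nan, np.nan, np.nan, np.nan],
})

# isnull() marks missing cells as True; the mean of booleans within each
# group is the fraction of missing values, here scaled to percent.
missing_share = (
    toy.drop(columns="survived")
       .isnull()
       .groupby(toy["survived"])
       .mean()
       .mul(100)
       .round(1)
)
print(missing_share)
```

Replacing `toy` with `df` should give the per-group shares for every column of the real data in a single table.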

In [16]:
# a few random records, to get a feel for what we are dealing with
df.sample(5)
Out[16]:
pclass survived name sex age sibsp parch ticket fare cabin embarked boat body home.dest
961 3.0 0.0 Lennon, Miss. Mary female NaN 1.0 0.0 370371 15.5000 NaN Q NaN NaN NaN
856 3.0 1.0 Healy, Miss. Hanora "Nora" female NaN 0.0 0.0 370375 7.7500 NaN Q 16 NaN NaN
238 1.0 1.0 Robert, Mrs. Edward Scott (Elisabeth Walton Mc... female 43.0 0.0 1.0 24160 211.3375 B3 S 2 NaN St Louis, MO
1066 3.0 0.0 Novel, Mr. Mansouer male 28.5 0.0 0.0 2697 7.2292 NaN C NaN 181.0 NaN
301 1.0 0.0 Walker, Mr. William Anderson male 47.0 0.0 0.0 36967 34.0208 D46 S NaN NaN East Orange, NJ
In [17]:
# how many people in the dataset survived, and how many did not?
df['survived'].value_counts()
Out[17]:
0.0    809
1.0    500
Name: survived, dtype: int64
In [18]:
# how is the missing data distributed in the two groups (survivors and non-survivors)?
df[df['survived'] == 1.0].isnull().sum()
Out[18]:
pclass         0
survived       0
name           0
sex            0
age           73
sibsp          0
parch          0
ticket         0
fare           0
cabin        307
embarked       2
boat          23
body         500
home.dest    153
dtype: int64
In [19]:
df[df['survived'] == 0.0].isnull().sum()
Out[19]:
pclass         0
survived       0
name           0
sex            0
age          190
sibsp          0
parch          0
ticket         0
fare           1
cabin        707
embarked       0
boat         800
body         688
home.dest    411
dtype: int64

2. Conclusions from the analysis of single variables:¶

Due to the way passengers were distributed on the ship, each passenger class must be considered separately. A glance at the charts below clearly shows that almost 100% of the women and children from I and II class were rescued. From what I recall, the only child from I class who was not rescued was a 2-year-old girl who got separated from her parents somewhere on the way to the lifeboat deck. The parents went looking for her, and all three disappeared. Some women did not want to leave their husbands, and one would not leave her beloved dog; she was later found frozen in the water, still holding her equally frozen pet in her arms. However, these are individual cases. In general, being a woman or a child under the age of 15 from I or II class practically guaranteed survival. The situation was different in III class. Here, the chances of survival for women or children hovered around 50%.
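
The claim about women and children can also be checked numerically rather than read off the charts. The sketch below shows one possible way to do it; the rows in `toy` are invented, the age limit of 15 is the one used in the text, and treating passengers of unknown age as adults is my assumption.

```python
import pandas as pd

# Made-up rows for illustration; only the column names follow the dataset.
toy = pd.DataFrame({
    "pclass":   [1.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0],
    "sex":      ["female", "male", "male", "female", "male",
                 "female", "male", "female", "male"],
    "age":      [35.0, 8.0, 50.0, 28.0, 40.0, 22.0, 4.0, 30.0, 19.0],
    "survived": [1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
})

CHILD_AGE_LIMIT = 15  # threshold from the text; unknown ages (NaN) count as adults

# NaN < 15 evaluates to False, so rows without an age only qualify if female.
is_woman_or_child = (toy["sex"] == "female") | (toy["age"] < CHILD_AGE_LIMIT)

# Mean of the 0/1 'survived' flag per class = survival rate, scaled to percent.
rate = (
    toy[is_woman_or_child]
    .groupby("pclass")["survived"]
    .mean()
    .mul(100)
    .round(1)
)
print(rate)
```

Running the same mask on `df` would give the per-class survival rate of women and children in the real data.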

As for men, it was a true hecatomb. Very few survived. However, there were some patterns. First of all, the percentage of survivors was higher the higher the class they belonged to. Secondly, most of the survivors were young men under the age of 35. Apart from I class, surviving older men were only individual cases. In this way, we’ve found answers to our first three questions:

Question number 1: What impact did age have on the chances of survival?¶

A very large one. For men, it was mainly the young who survived. In every class, the average age of the men who survived was lower than the average age of all the men in that class (36 versus 41 years in I class, 22 versus 26 in III class). For II class, this average even dropped from 31 to 17 years.

Question number 2: Did gender matter?¶

It was decisive! Over 95% of I class women survived, almost 90% from II class, and only about half of the women from III class. For men, only 35% from I class survived, and just 15% each from II and III class. Yet, we still don’t know why. We’ll address that question in a moment.

Question number 3: Did class membership matter?¶

Of course, absolutely. 62% of I class passengers survived, 43% from II class, and only 25% from III class.

Question number 4: Did having a spouse or siblings increase the chance of survival?¶

Among the passengers who had siblings or a spouse aboard, the percentage who were rescued is slightly higher than in the passenger population overall. In I class it is 71% compared to 62%, in II class 53% compared to 43%, and in III class 26% compared to 25%. This means that having a spouse or sibling aboard increases the chances of survival, albeit only by 1 to 10 percentage points. Nevertheless, any increase is better than none. Although I must admit, I expected more spectacular results. Thus, the hypothesis from point 4 was, albeit "with a certain shyness," confirmed.

Question number 5: Did having children or parents reduce the chances of survival?¶

Absolutely not! For passengers who had parents or children aboard, the chances of survival paradoxically increased. The proportions are, respectively, 73% versus 62% in I class, 77% versus 43% in II class, and 32% versus 25% in III class. This means that having children or parents aboard was not a burden, but in fact significantly increased the chance of survival: in relative terms by roughly 20% in I class and by as much as 80% in II class! This means the hypothesis in point 5 is completely disproven.
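
Because a difference between two percentages can be read in two ways, the short sketch below works out both for the figures quoted above: the gap in percentage points and the relative change. The helper `compare` and the dictionary labels are mine; the numbers are the rounded percentages from the text, so the results are only approximate.

```python
# Survival percentages quoted in the text, per class:
# (passengers with parents/children aboard, all passengers).
rates = {
    "I class": (73, 62),
    "II class": (77, 43),
    "III class": (32, 25),
}

def compare(with_family, overall):
    """Return (difference in percentage points, relative change in percent)."""
    point_diff = with_family - overall
    relative_change = (with_family / overall - 1) * 100
    return point_diff, round(relative_change)

for label, (with_family, overall) in rates.items():
    points, relative = compare(with_family, overall)
    print(f"{label}: +{points} percentage points, about +{relative}% relative")
```

The relative change is the reading that produces figures in the range of 20% to 80%; in percentage points the same gaps are 11, 34, and 7.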

Question number 7: Did waiting for a lifeboat on the right or left side of the ship influence the chance of survival?¶

It turns out, yes! Among the survivors in our data, only 216 boarded lifeboats on the left (port) side, while 251 did so on the right (starboard) side, over 15% more. So, something as seemingly trivial as which side of the ship we tried to board the lifeboats from can seriously increase or decrease our chances of rescue. To a greater extent than having a helpful husband or sibling. So let’s find out why this was the case.

In [20]:
# survivors broken down by sex and class (in percent)
# ---------------------------------------------------

# group the data by class and sex
grouped = df.groupby(['pclass', 'sex'])

# count all passengers in each group
total_counts = grouped.size().reset_index(name='total')

# count the passengers who survived in each group
survived_counts = grouped['survived'].sum().reset_index(name='survived')

# merge the two results
merged = pd.merge(total_counts, survived_counts, on=['pclass', 'sex'])

# compute the percentage of survivors
merged['survival_rate'] = ((merged['survived'] / merged['total']) * 100).round(1)

# build a DataFrame with the per-class "Total" rows
totals = merged.groupby('pclass').sum(numeric_only=True).reset_index()
totals['sex'] = 'Total'
totals['survival_rate'] = ((totals['survived'] / totals['total']) * 100).round(1)

# append the total rows for each class
final_df = pd.concat([merged, totals], ignore_index=True)

# sort by class and sex (so that "Total" appears at the end of each class)
final_df['sex'] = pd.Categorical(final_df['sex'], categories=['female', 'male', 'Total'], ordered=True)
final_df = final_df.sort_values(by=['pclass', 'sex']).reset_index(drop=True)

# and voila!
final_df
Out[20]:
pclass sex total survived survival_rate
0 1.0 female 144 139.0 96.5
1 1.0 male 179 61.0 34.1
2 1.0 Total 323 200.0 61.9
3 2.0 female 106 94.0 88.7
4 2.0 male 171 25.0 14.6
5 2.0 Total 277 119.0 43.0
6 3.0 female 216 106.0 49.1
7 3.0 male 493 75.0 15.2
8 3.0 Total 709 181.0 25.5
In [22]:
# age statistics for men (all men vs. those who survived)
# -------------------------------------------------------

# filter: all men
all_males = df[df['sex'] == 'male']

# filter: only the men who survived
survived_males = all_males[all_males['survived'] == 1.0]

# helper computing rounded statistics
def compute_stats(grouped):
    stats = grouped.agg(['mean', 'std', 'median', lambda x: x.quantile(0.25), lambda x: x.quantile(0.75)])
    stats.columns = ['mean', 'std', 'median', 'Q1', 'Q3']
    return stats.round()

# group the data by class and compute the statistics
statistics_all = compute_stats(all_males.groupby('pclass')['age'])
statistics_survived = compute_stats(survived_males.groupby('pclass')['age'])

# and the result
print("Statistics for all men:")
print(statistics_all.to_string(index=True))
print("\nStatistics for men who survived:")
print(statistics_survived.to_string(index=True))
Statistics for all men:
        mean   std  median    Q1    Q3
pclass                                
1.0     41.0  15.0    42.0  30.0  50.0
2.0     31.0  14.0    30.0  23.0  39.0
3.0     26.0  12.0    25.0  20.0  32.0

Statistics for men who survived:
        mean   std  median    Q1    Q3
pclass                                
1.0     36.0  15.0    36.0  27.0  48.0
2.0     17.0  17.0    19.0   2.0  30.0
3.0     22.0  11.0    25.0  18.0  29.0
In [23]:
# drawing clear, expressive charts
#---------------------------------

# chart style settings
sns.set(style="whitegrid")

# create the subplots for all charts
fig, axs = plt.subplots(3, 2, figsize=(18, 15), sharey=False)

# colors for each class
colors = {
    1.0: {'all': 'darkgray', 'survived': 'blue'},
    2.0: {'all': 'darkgray', 'survived': 'green'},
    3.0: {'all': 'darkgray', 'survived': 'red'}
}

# add spacing between the charts
plt.subplots_adjust(wspace=0.05, hspace=0.5)

# create the charts for women and men in each class
for i, pclass in enumerate([1.0, 2.0, 3.0]):
    df_class = df[df['pclass'] == pclass]

    ax_female = axs[i, 0]
    ax_male = axs[i, 1]

    females = df_class[df_class['sex'] == 'female']
    males = df_class[df_class['sex'] == 'male']

    # histogram for women
    all_females = females['age']
    survived_females = females[females['survived'] == 1.0]['age']

    ax_female.hist(all_females.dropna(), bins=range(0, 81, 5), color=colors[pclass]['all'], label='Women', alpha=0.7, width=4)
    ax_female.hist(survived_females.dropna(), bins=range(0, 81, 5), color=colors[pclass]['survived'], label='Women who survived', alpha=0.7, width=4)
    ax_female.set_title(f'Class {int(pclass)} Women', fontsize=16)
    ax_female.set_xlabel('Age (in years)', fontsize=16)
    ax_female.set_ylabel('Number of passengers', fontsize=16)

    if i < 2:  # set the vertical scale to 30 for the first two rows
        ax_female.set_ylim(0, 30)
    elif i == 2:  # set the scale for women in the third row to 40
        ax_female.set_ylim(0, 40)
    ax_female.legend(fontsize=14)

    # histogram for men
    all_males = males['age']
    survived_males = males[males['survived'] == 1.0]['age']

    ax_male.hist(all_males.dropna(), bins=range(0, 81, 5), color=colors[pclass]['all'], label='Men', alpha=0.7, width=4)
    ax_male.hist(survived_males.dropna(), bins=range(0, 81, 5), color=colors[pclass]['survived'], label='Men who survived', alpha=0.7, width=4)
    ax_male.set_title(f'Class {int(pclass)} Men', fontsize=16)
    ax_male.set_xlabel('Age (in years)', fontsize=16)
    ax_male.set_ylabel('Number of passengers', fontsize=16)

    if i < 2:  # set the vertical scale to 30 for the first two rows
        ax_male.set_ylim(0, 30)
    elif i == 2:  # keep a higher scale of 90 for men in the third row
        ax_male.set_ylim(0, 90)
    ax_male.legend(fontsize=14)

    # add horizontal grid lines to all charts
    ax_female.yaxis.grid(True, linestyle='-', which='both', color='grey', alpha=0.7)
    ax_male.yaxis.grid(True, linestyle='-', which='both', color='grey', alpha=0.7)

plt.tight_layout(rect=[0, 0, 1, 0.95])
plt.show()
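Figure: age histograms of women (left column) and men (right column) for classes 1 to 3 (rows), with all passengers in gray and survivors overlaid in color.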
In [24]:
# survival chances of passengers traveling with children/parents or with a spouse/siblings
# ----------------------------------------------------------------------------------------

# helper computing the percentage of survivors versus non-survivors
def calculate_percentage(survived_count, non_survived_count):
    total = survived_count + non_survived_count
    survived_percentage = ((survived_count / total) * 100)
    non_survived_percentage = ((non_survived_count / total) * 100)
    return survived_percentage, non_survived_percentage

# dictionaries for storing the results
sibs_results = {}
parch_results = {}

# list of classes
classes = [1.0, 2.0, 3.0]

# loop over the classes
for cls in classes:
    survived_sibs = df[(df['survived'] == 1.0) & (df['sibsp'] > 0) & (df['pclass'] == cls)].shape[0]
    non_survived_sibs = df[(df['survived'] == 0.0) & (df['sibsp'] > 0) & (df['pclass'] == cls)].shape[0]
    sibs_results[cls] = calculate_percentage(survived_sibs, non_survived_sibs)

    survived_parch = df[(df['survived'] == 1.0) & (df['parch'] > 0) & (df['pclass'] == cls)].shape[0]
    non_survived_parch = df[(df['survived'] == 0.0) & (df['parch'] > 0) & (df['pclass'] == cls)].shape[0]
    parch_results[cls] = calculate_percentage(survived_parch, non_survived_parch)

# and the results
print("\nPassengers with siblings or a spouse:\n")
for cls, res in sibs_results.items():
    print(f"Class {cls}: Survived {round(res[0])}%, Not Survived {round(res[1])}%")

print("\nPassengers with children or parents:\n")
for cls, res in parch_results.items():
    print(f"Class {cls}: Survived {round(res[0])}%, Not Survived {round(res[1])}%")
Passengers with siblings or a spouse:

Class 1.0: Survived 71%, Not Survived 29%
Class 2.0: Survived 53%, Not Survived 47%
Class 3.0: Survived 26%, Not Survived 74%

Passengers with children or parents:

Class 1.0: Survived 73%, Not Survived 27%
Class 2.0: Survived 77%, Not Survived 23%
Class 3.0: Survived 32%, Not Survived 68%
In [25]:
# add the 'burta' (side of the ship) column: starboard and port
#--------------------------------------------------------------

# function determining the value of the new 'burta' column:
# 'L' = port (left) side, 'P' = starboard (right) side
def determine_burta(boat):
    if boat in ['2', '4', '6', '8', '10', '12', '14', '16', 'B', 'D']:
        return 'L'
    elif boat in ['1', '3', '5', '7', '9', '11', '13', '15', 'A', 'C']:
        return 'P'
    else:
        return None

# add the new 'burta' column to the DataFrame
df['burta'] = df['boat'].apply(determine_burta)
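
The same side assignment can also be sketched with a lookup dictionary and `Series.map`, which avoids calling a Python function for every row. This is an alternative illustration rather than the notebook's method: the names `PORT`, `STARBOARD`, and `SIDE_BY_BOAT` are mine, and the `boats` series is a made-up sample.

```python
import pandas as pd

# Boat labels per side, copied from the two lists in determine_burta.
PORT = ["2", "4", "6", "8", "10", "12", "14", "16", "B", "D"]
STARBOARD = ["1", "3", "5", "7", "9", "11", "13", "15", "A", "C"]

# 'L' and 'P' are the labels used in the notebook's 'burta' column.
SIDE_BY_BOAT = {**{b: "L" for b in PORT}, **{b: "P" for b in STARBOARD}}

# Hypothetical 'boat' values, including a missing one and an unlisted label.
boats = pd.Series(["16", "2", "13", None, "C", "X"], name="boat")

# Series.map leaves NaN wherever the value is not a key of the dictionary.
side = boats.map(SIDE_BY_BOAT)
print(side.value_counts(dropna=False))
```

One difference to keep in mind: unmatched values become NaN here instead of None, which makes no difference for comparisons such as `== 'L'`.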
In [26]:
# survivors from the port (left) side
survived_left_df = df[(df['survived'] == 1.0) & (df['burta'] == 'L')].shape[0]
print(survived_left_df)
216
In [27]:
# survivors from the starboard (right) side
survived_right_df = df[(df['survived'] == 1.0) & (df['burta'] == 'P')].shape[0]
survived_right_df
Out[27]:
251

3. Conclusions from the Demographic Analysis of Individual Lifeboats:¶

Until now, we have considered our data in a static way: so many people saved, so many drowned. However, the passengers were not saved in the blink of an eye. It was a process stretched out over time. The lifeboats were successively filled with people and lowered onto the water—minute by minute, hour by hour. It is interesting to consider what conclusions we might reach if we analyze the process of rescuing passengers as one characterized by a certain dynamism, changing over time.

Let’s imagine it. With each passing moment, the Titanic is sinking deeper. The deck is no longer perfectly level, but tilts toward the bow, submerged in water. It’s now certain that the ship will sink. The cold air is occasionally split by the piercing whistle of steam released from the boilers, to prevent them from exploding on contact with the icy water. The deck boards creak under the pressure of hundreds of feet. Tension rises. The sailors lower the next lifeboats onto the water faster and faster. Some passengers pray aloud, others curse the world. The orchestra plays increasingly sad and solemn tunes. Clergymen give last rites and lead prayers. Everyone will soon stand before the Lord. Some passengers peacefully accept their fate; others desperately search for salvation. The threat of imminent death becomes more real with every passing minute. Panic begins to grip the people. In such circumstances, the best and worst traits of our character reveal themselves.

Question number 6: Is There Any Relationship Between the Lifeboat Number and the Class or Gender?¶

Looking at the graphs below, we can clearly see how the composition of the passengers placed in the lifeboats changed over time. In the first seven boats that were launched, almost 100% of the passengers were from I class. Next comes boat number 16, a strong anomaly, as it is filled exclusively with passengers (or rather, women) from III class. The next four boats are predominantly filled with passengers from II class, with only small numbers from the other classes. In the following boats, people from III class begin to dominate, probably because the women and children from the higher classes “ran out”. By “ran out” I mean, of course, that there were no more of them left on the evacuation deck. The last few boats are filled rather randomly, but are still mostly dominated by III class passengers. The class distribution is especially clearly visible in the case of boats launched from the starboard side. The officer responsible for filling and launching the lifeboats on this side was First Officer Murdoch. As can be seen, he strictly maintained order, and at the same time saved several dozen more passengers than the officer in charge on the port side, Second Officer Lightoller. Including crew members, Murdoch saved over 100 more people. So, it turns out that order during evacuation adds a “plus one” to rescue outcomes.

The charts about the gender of rescued persons, broken down by lifeboat, show something interesting as well. Here, we can see a huge disparity between the lifeboats lowered from the starboard side and those from the port side. This resulted from the different interpretations of Captain Smith’s order about saving women and children. Second Officer Lightoller, who supervised launching lifeboats from the port side, interpreted the captain’s command as: ONLY women and children, while First Officer Murdoch, in charge on the starboard side, understood it as: FIRST women and children. Therefore, in “his” lifeboats, the empty seats were also taken by men. It was thanks to this that in the end, he saved more people. So, it is not blind obedience to orders, but flexible adaptation to changing circumstances, that increases the chances of survival.
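
The gender disparity between the two sides can be expressed as row-wise shares, so that the sides remain comparable even though they carried different numbers of people. A minimal sketch, assuming the `burta` column defined earlier ('L' = port, 'P' = starboard); the rows in `toy` are invented for illustration.

```python
import pandas as pd

# Made-up survivors; 'burta' uses the notebook's labels: 'L' = port, 'P' = starboard.
toy = pd.DataFrame({
    "burta": ["L", "L", "L", "L", "P", "P", "P", "P", "P"],
    "sex":   ["female", "female", "female", "male",
              "female", "female", "male", "male", "male"],
})

# normalize='index' turns each row into shares that sum to 1;
# multiplying by 100 expresses them in percent.
share_by_side = (
    pd.crosstab(toy["burta"], toy["sex"], normalize="index")
    .mul(100)
    .round(1)
)
print(share_by_side)
```

Applied to the survivors in `df`, a noticeably higher share of men in the 'P' row than in the 'L' row would be consistent with the two officers' different readings of the order.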

Analysis of the occupants of consecutively launched lifeboats.¶

Diagram of lifeboat arrangement on the deck:¶

rozmieszczenie lodzi ratunkowych.jpg

Times of launching the individual lifeboats:¶

| Launch time | Lifeboat number | Standard capacity | People aboard when lowered |
|---|---|---|---|
| 00:40 | 7 | 65 | 28 |
| 00:45 | 5 | 65 | 36 |
| 00:55 | 3 | 65 | 32 |
| 01:00 | 8 | 65 | 25 |
| 01:05 | 1 | 40 | 12 |
| 01:10 | 6 | 65 | 22 |
| 01:20 | 16 | 65 | 53 |
| 01:25 | 14 | 65 | 40 |
| 01:30 | 9 | 65 | 40 |
| 01:30 | 12 | 65 | 42 |
| 01:35 | 11 | 65 | 50 |
| 01:35 | 13 | 65 | 55 |
| 01:40 | 15 | 65 | 68 |
| 01:45 | 2 | 40 | 17 |
| 01:50 | 10 | 65 | 57 |
| 01:50 | 4 | 65 | 30 |
| 02:00 | C | 47 | 43 |
| 02:05 | D | 47 | 20 |
| 02:14 | A | 47 | 12 |
| 02:15 | B | 47 | 12 |
| Totals: | 20 | 1178 | 686 |
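
The table can be turned into a quick check of how full each boat was and how the two sides of the ship compare. This is only a minimal sketch that retypes the per-boat figures from the table above. The side assignment (odd-numbered boats plus collapsibles A and C on the starboard side, the rest on the port side) is my assumption, and `side_of_boat` is a hypothetical stand-in for the notebook's `determine_burta` helper, whose definition is not shown in this section.

```python
# (boat, standard capacity, people aboard when lowered), copied from the table above
LIFEBOATS = [
    ('7', 65, 28), ('5', 65, 36), ('3', 65, 32), ('8', 65, 25),
    ('1', 40, 12), ('6', 65, 22), ('16', 65, 53), ('14', 65, 40),
    ('9', 65, 40), ('12', 65, 42), ('11', 65, 50), ('13', 65, 55),
    ('15', 65, 68), ('2', 40, 17), ('10', 65, 57), ('4', 65, 30),
    ('C', 47, 43), ('D', 47, 20), ('A', 47, 12), ('B', 47, 12),
]


def side_of_boat(boat):
    """Hypothetical stand-in for determine_burta.

    Assumption: odd-numbered boats and collapsibles A and C were on the
    starboard side, everything else on the port side.
    """
    if boat.isdigit():
        return 'starboard' if int(boat) % 2 == 1 else 'port'
    return 'starboard' if boat in ('A', 'C') else 'port'


# share of the standard capacity that was actually used, per boat
fill_rate = {boat: aboard / capacity for boat, capacity, aboard in LIFEBOATS}

# people aboard, summed separately for each side of the ship
side_totals = {'starboard': 0, 'port': 0}
for boat, _capacity, aboard in LIFEBOATS:
    side_totals[side_of_boat(boat)] += aboard

total_capacity = sum(capacity for _boat, capacity, _aboard in LIFEBOATS)

print(f"Total capacity: {total_capacity}")
for boat, rate in sorted(fill_rate.items(), key=lambda item: item[1]):
    print(f"boat {boat:>2}: {rate:.0%} of capacity")
print(side_totals)
```

Under these assumptions the starboard boats carry 376 people against 318 on the port side, a difference of 58, which is in line with the "several dozen" attributed to Murdoch's side earlier. Boat 15 is the only one above its standard capacity, while boat 1 left at 30%.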

The last rafts to be launched, marked A and B, were no longer lowered normally, but rather simply washed off the deck of the sinking Titanic by the waves. Raft A took on a lot of water, and the passengers who managed to swim to it and climb aboard had to stand knee-deep in water. More than half of them died of hypothermia. Raft B, on the other hand, capsized and ended up upside down, with only a dozen or so people managing to climb onto it and somehow hold on. Among them was Second Officer Lightoller, who had a whistle. This ended up saving them, as it was his whistle that the rescue teams heard—teams that were already turning back, believing there would be no more survivors. As you can see, a whistle gives you a plus one to being rescued.

We also have to make a slight correction to the order in which the boats were launched. Boat number 4 began to be lowered as one of the first, even before boat number 6, but it was stopped at the level of deck A, where it continued to take on passengers, so it only reached the water as one of the last boats (01:50 in the table). Since what matters most for us is the moment the passengers boarded the boats, not the moment the boats hit the water, in the charts below I place boat 4 right before boat 6 rather than at its launch time.
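
A minimal sketch of that correction, assuming the launch order is read straight from the table above: boat 4 is taken out of its 01:50 slot and put back just before boat 6, where its loading began. The names `launch_order`, `move_before` and `boarding_order` are illustrative and not part of the notebook.

```python
# Order in which the boats reached the water, as listed in the table above
launch_order = ['7', '5', '3', '8', '1', '6', '16', '14', '9', '12',
                '11', '13', '15', '2', '10', '4', 'C', 'D', 'A', 'B']


def move_before(order, boat, anchor):
    """Return a copy of `order` with `boat` moved directly before `anchor`."""
    reordered = [b for b in order if b != boat]
    reordered.insert(reordered.index(anchor), boat)
    return reordered


# boat 4 started loading before boat 6, so it is moved in front of it
boarding_order = move_before(launch_order, '4', '6')
print(boarding_order)
```

The result is the same sequence as the `boat_order` list used in the plotting cells.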

We should also not be misled by the absurdly low occupancy of some boats, such as lifeboat number 1. Let’s not forget that in addition to the passengers, 212 crew members were also saved. They had to fit somewhere. Each lifeboat was typically assigned 4 sailors to row and one petty officer or junior officer to command the boat. In the case of lifeboat number 1, it also took on an additional 10 stokers (boiler room crew), who, covered in soot, had just made their way to the evacuation deck through a ventilation shaft, fleeing from the flooding boiler room.

In [34]:
# charts showing the distribution of passenger classes across the lifeboats
# -------------------------------------------------------------------------

# count the passengers of each class in each lifeboat
def count_passengers_by_class(df, boat_order):
    results = []
    for boat in boat_order:
        for cls in [1.0, 2.0, 3.0]:
            count = len(df[(df['boat'] == boat) & (df['pclass'] == cls)])
            results.append({'boat': boat, 'class': cls, 'count': count})
    return pd.DataFrame(results)

# order of the lifeboats (boat 4 placed where its boarding began)
boat_order = ['7', '5', '3', '8', '1', '4', '6', '16', '14', '9', '12', '11', '13', '15', '2', '10', 'C', 'D', 'A', 'B']

# count passengers by class and lifeboat
data = count_passengers_by_class(df, boat_order)

# split the data into the right ('P', starboard) and left ('L', port) side
data_right = data[data['boat'].isin([boat for boat in boat_order if determine_burta(boat) == 'P'])]
data_left = data[data['boat'].isin([boat for boat in boat_order if determine_burta(boat) == 'L'])]

# create the figure and the subplot grid
fig = plt.figure(figsize=(18, 12))
gs = fig.add_gridspec(2, 2, width_ratios=[1, 1], height_ratios=[1, 1])

# top summary chart spanning the whole first row (both columns)
ax_big = fig.add_subplot(gs[0, :])
# the two charts below it
ax_right = fig.add_subplot(gs[1, 1])  # bottom right
ax_left = fig.add_subplot(gs[1, 0])   # bottom left

# helper that draws one grouped bar chart (one bar per class and lifeboat)
def plot_barchart(data, ax, title, show_ylabel=False):
    sns.barplot(x='boat', y='count', hue='class', data=data,
                palette={1.0: 'blue', 2.0: 'green', 3.0: 'red'},
                ax=ax, legend=False)
    ax.set_title(title)
    ax.set_ylabel('Number of passengers' if show_ylabel else '')
    ax.set_xlabel('Lifeboat number')
    for label in ax.get_xticklabels():
        label.set_fontweight('bold')
    ax.grid(False, axis='x')
    ax.grid(True, axis='y', linestyle=':', linewidth=1, color='black', alpha=0.5)

plot_barchart(data, ax_big, 'Number of passengers in consecutively launched lifeboats, by class', show_ylabel=True)
plot_barchart(data_right, ax_right, 'Starboard side only')
plot_barchart(data_left, ax_left, 'Port side only')

# synchronize the Y axes of the two lower charts with the top one
# (the original referenced an undefined `axes` array here)
for ax in [ax_right, ax_left]:
    ax.set_yticks(ax_big.get_yticks())
    ax.set_ylabel('')

legend_patches = [
    Patch(facecolor='blue', label='Class 1'),
    Patch(facecolor='green', label='Class 2'),
    Patch(facecolor='red', label='Class 3')
]
fig.legend(handles=legend_patches, loc='upper center', ncol=3, title='Legend', bbox_to_anchor=(0.5, 0.93))

plt.tight_layout()
plt.show()
[Figure: number of passengers in consecutively launched lifeboats, by class (top: all boats; bottom: port side only and starboard side only)]
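
The nested loop in `count_passengers_by_class` can also be written with `pd.crosstab`. The sketch below runs on a few made-up rows (not real passenger records) that only mimic the `boat` and `pclass` columns; `toy`, `counts` and `counts_long` are illustrative names, and the shortened `boat_order` is there just to keep the output small.

```python
import pandas as pd

# Made-up rows for illustration only; the notebook itself uses the full `df`
toy = pd.DataFrame({
    'boat':   ['7', '7', '5', '16', '16', '16', None],
    'pclass': [1.0, 1.0, 1.0, 3.0, 3.0, 2.0, 3.0],
})
boat_order = ['7', '5', '16', 'C']  # shortened order; 'C' has no rows here

# Cross-tabulate boats against classes. Rows without a boat are dropped,
# and reindex() restores the launch order and fills absent boats with 0.
counts = (pd.crosstab(toy['boat'], toy['pclass'])
            .reindex(index=boat_order, columns=[1.0, 2.0, 3.0], fill_value=0)
            .rename_axis(index='boat', columns='class'))

# Long format with the same column names that count_passengers_by_class returns
counts_long = counts.reset_index().melt(id_vars='boat', var_name='class',
                                        value_name='count')
print(counts)
print(counts_long)
```

The wide `counts` table is handy for reading off single values, while `counts_long` has the `boat`, `class` and `count` columns that `sns.barplot` expects in the cell above. Note that `melt` lists the rows class by class rather than boat by boat, which does not change the order of the boats on the x axis.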
In [40]:
# charts showing the distribution of sex and children across the lifeboats
# ------------------------------------------------------------------------

# count the passengers by sex and age (children under 15) in each lifeboat
def count_passengers_by_sex_and_age(df, boat_order):
    results = []
    for boat in boat_order:
        for sex in ['female', 'male']:
            total_count = len(df[(df['boat'] == boat) & (df['sex'] == sex)])
            children_count = len(df[(df['boat'] == boat) & (df['sex'] == sex) & (df['age'] < 15)])
            results.append({'boat': boat, 'sex': sex, 'total_count': total_count, 'children_count': children_count})
    return pd.DataFrame(results)

# count passengers by sex and age
data_sex_age = count_passengers_by_sex_and_age(df, boat_order)

# split the data into the right ('P', starboard) and left ('L', port) side
data_right_sex = data_sex_age[data_sex_age['boat'].isin([boat for boat in boat_order if determine_burta(boat) == 'P'])]
data_left_sex = data_sex_age[data_sex_age['boat'].isin([boat for boat in boat_order if determine_burta(boat) == 'L'])]

# create the figure and the subplot grid
fig = plt.figure(figsize=(18, 12))
gs = fig.add_gridspec(2, 2, width_ratios=[1, 1], height_ratios=[1, 1])

# top summary chart spanning the whole first row (both columns)
ax_big = fig.add_subplot(gs[0, :])
# the two charts below it
ax_right = fig.add_subplot(gs[1, 1])  # bottom right
ax_left = fig.add_subplot(gs[1, 0])   # bottom left

# helper that draws the bars by sex, with children as a hatched overlay
def plot_barchart_by_sex(data, ax, title, show_ylabel=False):
    colors = {'female': 'deeppink', 'male': 'deepskyblue'}
    sex_labels = {'female': 'Women', 'male': 'Men'}
    bar_width = 0.4
    boats = data['boat'].unique()

    for sex in ['female', 'male']:
        sex_data = data[data['sex'] == sex]
        offset = -bar_width / 2 if sex == 'female' else bar_width / 2
        x = [i + offset for i in range(len(boats))]
        ax.bar(x, sex_data['total_count'], bar_width, color=colors[sex], label=sex_labels[sex])
        # children are drawn on top of the totals as hatched bars
        ax.bar(x, sex_data['children_count'], bar_width, color=colors[sex], hatch='//', edgecolor='black')

    ax.set_title(title)
    # the original tested `ax is axes[0]` here, before `axes` was defined
    ax.set_ylabel('Number of passengers' if show_ylabel else '')
    ax.set_xlabel('Lifeboat number')
    ax.set_xticks(range(len(boats)))
    ax.set_xticklabels(boats, fontweight='bold')
    ax.grid(False, axis='x')
    ax.grid(True, axis='y', linestyle=':', linewidth=1, color='black', alpha=0.5)

plot_barchart_by_sex(data_sex_age, ax_big, 'Number of passengers in consecutively launched lifeboats, by sex', show_ylabel=True)
plot_barchart_by_sex(data_right_sex, ax_right, 'Starboard side only')
plot_barchart_by_sex(data_left_sex, ax_left, 'Port side only')

axes = [ax_big, ax_left, ax_right]  # used below for the y-ticks and the legend

# Y-axis settings: reuse the ticks of the top chart on the two lower charts
for ax in axes[1:]:
    ax.set_yticks(axes[0].get_yticks())
    ax.set_ylabel('')

# add the legend
handles, labels = axes[0].get_legend_handles_labels()
handles.append(Patch(facecolor='white', hatch='//', edgecolor='black', label='Children'))
fig.legend(handles=[handles[0], handles[1], handles[-1]], labels=['Women', 'Men', 'Children'], loc='upper center', ncol=1, title='Legend', bbox_to_anchor=(0.5, 0.93))

plt.tight_layout()
plt.show()
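[Figure: number of passengers in consecutively launched lifeboats, by sex, with children shown as hatched bars (top: all boats; bottom: port side only and starboard side only)]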

4. Summary¶

In the course of our analysis, we have addressed all the questions and hypotheses posed at the outset. We now know that being a woman or a child in I or II class gave an enormous, almost one hundred percent, chance of survival. This was, of course, due to the social norms of the time, under which women, considered the weaker sex, were always to be given way, allowed to go first, or offered a seat at the table. I wonder whether gentlemen would still behave so chivalrously nowadays, especially if their own lives, and not just common courtesy, were at stake. Even on the Titanic, the officers supervising the evacuation had to use their guns several times to enforce order and keep men from the "lower social classes" from storming the lifeboats while the ladies were boarding. I also wonder whether, on modern large liners, in the event of a pirate attack, terrorist act, or missile strike by the Houthis, a first-class ticket would still be a ticket to survival? Even if not, it's probably still better to be rich :-)

Returning to the Titanic: so many III class passengers died because they simply could not get to the deck. The stewards, who were the only ones with keys to the gates separating III class passengers from the rest of the ship, were very busy waking up higher-class passengers, handing out life jackets, directing them to the evacuation deck and the next lifeboats, searching for lost children, etc. As a result, as numerous accounts from survivors confirm, some of the gates remained locked until the very end. So by the time III class passengers managed to make their way out of the ship’s interior and through the water-flooded corridors to the highest evacuation deck, most of the lifeboats had already departed.

A little fun fact to conclude: The movie "Titanic" from 1997, directed by James Cameron and starring Leonardo DiCaprio and Kate Winslet, cost more to produce (adjusted for today’s money) than building the actual Titanic did back in 1911.
